Neweyes: A System for Comparing Biological Sequences Using the Running Karp-Rabin Greedy String-Tiling Algorithm
نویسنده
چکیده
A system for aligning nucleotide or amino acid biosequences is described. The system, called Neweyes, employs a novel string matching algorithm. Running Karp-Rabin Greedy String Tiling (RKR-GST), which involves tiling one string with matching substrings of a second string. In practice, RKR-GST has a computational complexity that appears close to linear. With RKR-GST, Neweyes is able to detect transposed substrings or substrings of one biosequence that appears rearranged in a second sequence. Repeated substrings can also be detected. Neweyes also supports a form of matching-by-group that gives the effect of different amino acid mutation matrices. Neweyes can be used in a macro mode (searching a database for a list of biosequences that are similar to a given biosequence) or in a micro mode, where two biosequences are compared and more detailed output formats are available.
منابع مشابه
Neweyes: A System for Comparing Biological Sequences Using the Running Karp-Rabin Greedy String-Tiling Algorithm1
A system for aligning nucleotide or amino acid biosequences is described. The system, called Neweyes, employs a novel string matching algorithm, Running KarpRabin Greedy String Tiling (RKR-GST), which involves tiling one string with matching substrings of a second string. In practice, RKR-GST has a computational complexity that appears close to linear. With RKR-GST, Neweyes is able to detect tr...
متن کاملRunning Karp-Rabin Matching and Greedy String Tiling
A system for aligning nucleotide or amino acid biosequences is described. The system, called Neweyes, employs a novel string matching algorithm, Running Karp-Rabin Greedy String Tiling (RKR-GST), which involves tiling one string with matching substrings of a second string. In practice, RKR-GST has a computational complexity that appears close to linear. With RKR-GST, Neweyes is able to detect t...
متن کاملUniversity of Sheffield - Lab Report for PAN at CLEF 2010
This paper describes the University of Sheffield entry for the 2nd international plagiarism detection competition (PAN 2010). Our system attempts to identify extrinsic plagiarism. A three-stage approach is used: pre-processing, candidate document selection (using word n-grams) and detailed analysis (using the Running Karp-Rabin Greedy String Tiling string matching algorithm). This approach achi...
متن کاملExternal Plagiarism Detection using Information Retrieval and Sequence Alignment - Notebook for PAN at CLEF 2011
This paper describes the University of Sheffield entry for the 3rd International Competition on Plagiarism Detection which attempted the monolingual external plagiarism detection task. A three stage framework was used: preprocessing and indexing, candidate document selection (using an Information Retrieval based approach) and detailed analysis (using the Running Karp-Rabin Greedy String Tiling ...
متن کاملImplementation of Rabin Karp String Matching Algorithm Using MPI
Searching for occurrences of string patterns is a common problem in many applications. One of the easiest approaches is to search a pattern in a text character by character. But this method becomes very slow when we deal with long patterns. Various good solutions like Naïve string Matcher, Knuth-Morris-Pratt (KMP), Rabin Karp (RK), etc have already been presented for string matching and we sele...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Proceedings. International Conference on Intelligent Systems for Molecular Biology
دوره 3 شماره
صفحات -
تاریخ انتشار 1995